Analyzing the Social Structure and Dynamics of E-mail and Spam in Massive Backbone Internet Traffic
نویسندگان
چکیده
E-mail is probably the most popular application on the Internet, with everyday business and personal communications dependent on it. Spam or unsolicited e-mail has been estimated to cost businesses significant amounts of money. However, our understanding of the network-level behavior of legitimate e-mail traffic and how it differs from spam traffic is limited. In this study, we have passively captured SMTP packets from a 10 Gbit/s Internet backbone link to construct a social network of e-mail users based on their exchanged emails. The focus of this paper is on the graph metrics indicating various structural properties of e-mail networks and how they evolve over time. This study also looks into the differences in the structural and temporal characteristics of spam and non-spam networks. Our analysis on the collected data allows us to show several differences between the behavior of spam and legitimate e-mail traffic, which can help us to understand the behavior of spammers and give us the knowledge to statistically model spam traffic on the network-level in order to complement current spam detection techniques.
منابع مشابه
Structural and Temporal Properties of E-mail and Spam Networks
In this paper we present a large-scale measurement study and analysis of e-mail traffic collected on an Internet backbone link. To the best of our knowledge this is one of the largest studies of network-wide behavior of e-mail traffic. We consider e-mail networks connecting senders and receivers that have communicated via e-mail, capturing their social interactions. Our study focuses on tempora...
متن کاملDetection of Spam Hosts and Spam Bots Using Network Flow Traffic Modeling
In this paper, we present an approach for detecting e-mail spam originating hosts, spam bots and their respective controllers based on network flow data and DNS metadata. Our approach consists of first establishing SMTP traffic models of legitimate vs. spammer SMTP clients and then classifying unknown SMTP clients with respect to their current SMTP traffic distance from these models. An entropy...
متن کاملA Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...
متن کاملA New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
Emails are one of the fastest economic communications. Increasing email users has caused the increase of spam in recent years. As we know, spam not only damages user’s profits, time-consuming and bandwidth, but also has become as a risk to efficiency, reliability, and security of a network. Spam developers are always trying to find ways to escape the existing filters therefore new filters to de...
متن کاملAnalyzing Large Collections of Email
One of the first applications of the Internet was the electronic mailing (e-mail). Along with the evolution of the Internet, e-mail has evolved into a powerful and popular technology. Messages, electronically documents, pictures and even movies can be send between users of computer systems at different places all over the world within seconds. Electronic mail is a fast, a cheap and a comfortabl...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1008.3289 شماره
صفحات -
تاریخ انتشار 2010